Project: Raw2Ent
ARexx : Raw2Ent.rexx
Version: 1.4.1 (14.07.96)
Program: Raw2Ent
Version: 1.9 (18.01.98)
Author : Tamio Patrick Honma
Files : Raw2Ent
Raw2Ent.ced
Raw2Ent.doc
Raw2Ent.dok
Raw2Ent.rexx
Raw2Ent.rexx.old
Raw2EntCheck
Raw2EntHTMLColors.iff
Raw2EntLogic.asc
Raw2EntMeta.asc
Raw2EntTables.doc
Raw2EntUml.r2e
CONTENTS:
1. INTRODUCTION
1.1. REQUIREMENTS
1.2. TECHNICAL INFORMATION
1.3. USER KNOWLEDGE
2. USAGE
2.1. Raw2Ent VER: 1.9 (18.01.98)
2.2. Raw2Ent.ced VER: 1.01 (26.12.96)
2.3. Raw2Ent.rexx VER: 1.4.1 (14.07.96)
2.4. Raw2EntCheck VER: 1.0 (12.12.96)
2.5. ARGUMENT-PRIORITY
3. LIMITATIONS
4. INSTALLATION
5. EXAMPLES
6. BYE!
7. LAST COMMENT
8. BUG REPORTS
9. HISTORY
10. SUPPORT & MS-DOS-VERSION
1. INTRODUCTION
Raw2Ent converts raw 8-bit ASCII text into 7-bit ASCII text with
entity-codes, and back again. ASCII is a standardized format for
information interchange, but only its seven-bit range is standardized, which
means that 128 codes are defined. One byte consists of eight bits and can
represent 256 different bit combinations. The upper 128 bit combinations
are therefore free for any operating system to define. The problem is that
accented characters and other special characters are not standardized,
because they are defined in (guess where?! ;) ) the free part of ASCII by
the operating-system developers.
The goal of the World Wide Web developers was that it could be used on every
important operating system. So it was clear that the ASCII-based
HTML source code had to use the standardized seven-bit area of the
ASCII code. To represent accented characters or other special characters in
a seven-bit code, it was necessary to invent something. And this was the
entity-code - a kind of escape-code. An entity-code consists of an
introducing "&" and a ";" at the end. Between these symbols is a
character-name the browser can interpret. Converting ASCII text by hand is
very tedious work. So just use Raw2Ent!
Raw2Ent produces real 7-bit ASCII code. All printable Amiga characters in
the 8-bit area are converted into entity-codes, without exception.
Using names instead of code-numbers makes the entity-codes easier
for humans to read.
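The conversion described above can be sketched in a few lines of Python. This is only an illustration, not the assembler code of Raw2Ent: it assumes that the Amiga 8-bit characters correspond to ISO 8859-1 (Latin-1), and it borrows the entity names from Python's standard table html.entities.codepoint2name instead of Raw2Ent's own tables. The function name raw_to_entities is made up for this example.

```python
from html.entities import codepoint2name

# The four characters that have a special meaning in HTML.
SPECIAL = '&<>"'

def raw_to_entities(text: str) -> str:
    """Replace the four special characters and every printable
    8-bit character (0xA0 and above) with a named entity-code."""
    out = []
    for ch in text:
        cp = ord(ch)
        if ch in SPECIAL or cp >= 0xA0:
            name = codepoint2name.get(cp)
            # Fall back to a code-number if no name is known.
            out.append(f"&{name};" if name else f"&#{cp};")
        else:
            out.append(ch)
    return "".join(out)

if __name__ == "__main__":
    print(raw_to_entities("K\u00e4se & Brot"))
```

The result contains only seven-bit characters, and a name like "auml" is easier to recognize than the code-number 228.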
You can also use Raw2Ent to check whether your file contains pure seven-bit
codes. If it does not, Raw2Ent reports the positions of the
8-bit characters in the text.
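The idea behind such a check is simple: a byte is outside the seven-bit area exactly when its highest bit is set. The following Python sketch shows the principle only. How Raw2Ent itself reports the positions is not reproduced here, so the zero-based byte offsets and the function name find_high_bit_bytes are assumptions of this example.

```python
def find_high_bit_bytes(data: bytes) -> list:
    """Return the zero-based offsets of all bytes with the high bit set."""
    return [offset for offset, byte in enumerate(data) if byte >= 0x80]

if __name__ == "__main__":
    sample = "K\u00e4se".encode("latin-1")
    positions = find_high_bit_bytes(sample)
    if positions:
        print("8-bit characters at offsets:", positions)
    else:
        print("pure 7-bit text")
```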
Raw2Ent can also handle color-codes and -names.
1.1. REQUIREMENTS
- AmigaOS 2.0 or greater
- optional: ARexx
- optional: Cygnus ED
1.2. TECHNICAL INFORMATION (Raw2Ent)
+-----------+--------------------------+--------------+--------------+
| Libraries | Code | OS-VERSION | Bytes |
+-----------+--------------------------+--------------+--------------+
| exec | program-counter-relative | 36 or higher | 1250 Stack |
| dos | MC 68000 CPU | | 11076 File |
| | not yet re-entrant! | | 13072 Memory |
| | | | + Data |
+-----------+--------------------------+--------------+--------------+
1.3. USER KNOWLEDGE
Raw2Ent:
- AmigaOS Knowledge: CLI-commands, -arguments and standard-output.
- American Standard Code for Information Interchange (ASCII)
- "Bit", "Byte" -> 7Bit-ASCII, 8Bit-ASCII [i.e.: Amiga-ASCII]
- Hypertext Markup Language (HTML) & Character-Entity-Codes & Color-Codes
- Edit text with an ASCII-Editor
Raw2Ent.rexx:
- How to start and use ARexx-Scripts
- ... see Raw2Ent
Raw2Ent.ced:
- How to use CygnusED and how to implement CED-Scripts
- How to use ARexx
Raw2EntCheck:
- How to use batch-files with AmigaOS
- ... see Raw2Ent
2. USAGE
Raw2Ent consists of four parts: one assembler-program, two ARexx-Scripts
and one Batch-File.
If you want to convert a text only once, you just need the
assembler-program. If you want to convert the same text repeatedly
because you are working on a project, such as a web page with current
information, the ARexx-Script may be useful.
Raw2Ent can be used with Cygnus ED.
Raw2Ent creates a backup-file whenever it overwrites another file. This
feature can be switched off with the switch "NOBAK" or "NOBACKUP".
If you want to append one file to another, use the AmigaDOS-convention.
Please note that the "BACKUP"-Feature has no effect on the
destination file!
You can use the RETURNBYTE-option to integrate the Check-Mode of Raw2Ent
into editors like Cygnus ED.
2.1. Raw2Ent VER: 1.9 (18.01.98)
arguments:
FROM - The source-file (eight bit wide)
TO - The destination-file (with entity-codes)
[path without filename is not accepted]
DATA/K - definition-file
ENT/S - default mode
TAG/S - activates the TAG-Mode
SMART/S - activates the smart-mode
COPY=HTML/S - activates the HTML-Mode
UML=NOENT/S - replaces high-bit characters with characters or words
CODE/S - writes all entity-codes as code-numbers
(except the four special entities)
TOTALCODE/S - converts ALL characters to entity-code-numbers
INVERSE=ENT2RAW/S - inverts the function of Raw2Ent (Ent2Raw)
CHECK/S - checks whether your text is pure 7-Bit-ASCII
COLCODE=COLORCODE/S - converts color-names to color-codes
COLHTML=COLORHTML/S - converts color-codes to HTML-Color-Names
COLORNETSCAPE/S - converts color-codes to NETSCAPE-Color-Names
LISTENT=LISTENTITY/S - lists the Entity-Table
LISTUML=LISTUMLAUT/S - lists the Umlaut-Table
LISTCOL=LISTCOLOR/S - lists the Color-Table (HTML 3.2)
LISTNET=LISTNETSCAPE/S- lists the Netscape-Color-Table (without HTML 3.2)
NOBAK=NOBACKUP/S - switches the backup feature off
RETBYTE=RETURNBYTE/S - returns byte-position as return-code in check-mode
modes:
>DATA-Mode<
loads one definition-file and uses this file as the conversion-table. If
you want to define a table of your own, you have to follow the
instructions very strictly!
Edit the datafile with an ASCII-Editor like this:
(8 Bit-character)=(expression)<LINE-FEED>
(8 Bit-character)=(expression)<LINE-FEED>
etc.
#<LINE-FEED>
EXAMPLE:
©=Copyright in
®=Produced on
¶=<P>
#
NOTE: YOU MUST STRICTLY FOLLOW THIS INSTRUCTION.
THIS MEANS:
- The 8-Bit character must be in the first column
- The "=" character must be in the second column
- The expression is free-form and must end with a line feed
- no empty lines are allowed
- the definition must end with a "#" on a line of its own
ALL UNDEFINED 8-BIT-CHARACTERS WILL BE REPRESENTED BY <NULL>
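To make the rules above concrete, here is a small Python sketch that reads a definition-file in this format and applies it to a text. It is not the parser used by Raw2Ent. The names parse_definition and apply_table are hypothetical, and because the exact meaning of <NULL> is not spelled out here, the replacement for undefined characters must be passed in explicitly by the caller.

```python
def parse_definition(data: bytes) -> dict:
    """Parse lines of the form (8-bit character)=(expression),
    terminated by a line that contains only '#'."""
    table = {}
    for line in data.split(b"\n"):
        if line == b"#":
            break  # end of the definition
        # Column one is the character, column two must be '='.
        if len(line) < 2 or line[1:2] != b"=":
            raise ValueError("malformed or empty line: %r" % line)
        table[line[0]] = line[2:]
    else:
        raise ValueError("missing '#' terminator")
    return table

def apply_table(data: bytes, table: dict, undefined: bytes) -> bytes:
    """Replace every high-bit byte using the table; bytes without
    an entry are replaced by the caller-supplied 'undefined' value."""
    out = bytearray()
    for byte in data:
        if byte < 0x80:
            out.append(byte)
        else:
            out += table.get(byte, undefined)
    return bytes(out)

if __name__ == "__main__":
    # The EXAMPLE above, written as Latin-1 bytes (an assumption).
    definition = b"\xa9=Copyright in\n\xae=Produced on\n\xb6=<P>\n#\n"
    table = parse_definition(definition)
    print(apply_table(b"\xa9 1998", table, undefined=b"?"))
```

The parser rejects empty lines and a missing "#" on purpose, because the format does not allow them.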
>ENT-Mode<
is the default mode and converts every known character into its entity-code.
>TAG-Mode<
will not convert the four characters: & < > ". This is useful for
ASCII-Text which already contains entity-codes or HTML-TAGS, which are
introduced and ended by "<" and ">" and which can contain quotes. The
"&"-character usually introduces the entity-codes. If you use the TAG-Mode,
the entity-codes in the source-file will not be wrongly converted a second
time, but special characters that are still unconverted will be. Therefore
you should use this mode whenever you convert a text a second time.
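Why the second pass matters can be shown with a short Python sketch. It is a simplified model of the two modes, not the code of Raw2Ent: the function convert and its tag_mode parameter are invented for this example, the 8-bit characters are assumed to be ISO 8859-1, and the entity names come from Python's html.entities.codepoint2name.

```python
from html.entities import codepoint2name

SPECIAL = '&<>"'

def convert(text: str, tag_mode: bool = False) -> str:
    """Model of ENT-Mode (default) and TAG-Mode (tag_mode=True)."""
    out = []
    for ch in text:
        cp = ord(ch)
        # TAG-Mode leaves the four special characters untouched.
        if cp >= 0xA0 or (ch in SPECIAL and not tag_mode):
            name = codepoint2name.get(cp)
            out.append(f"&{name};" if name else f"&#{cp};")
        else:
            out.append(ch)
    return "".join(out)

if __name__ == "__main__":
    already_converted = "<B>K&auml;se f\u00fcr alle</B>"
    print(convert(already_converted))                 # ENT-Mode
    print(convert(already_converted, tag_mode=True))  # TAG-Mode
```

In the ENT-Mode model the existing "&auml;" is damaged, because its "&" is converted again and the tags are escaped. In the TAG-Mode model only the remaining 8-bit character is converted.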
>SMART-Mode<
is a combination of the >ENT<-Mode and the >TAG<-Mode. HTML-Files, for
example, will be converted without destroying HTML-Tags and
character-entity-codes - like the >TAG<-Mode. The difference is that the
characters < > & " will be converted if Raw2Ent "thinks" that these
characters are not part of character-entity-codes or HTML-Tags. This
works best if the HTML-File contains "good" code. I cannot guarantee a
correct interpretation by Raw2Ent, but I think it